Abstract. We study some self-interacting diffusions living on $\mathbb{R}^d$, solutions to

$$dX_t = dB_t - g(t)\nabla V(X_t - \overline{\mu}_t)\,dt,$$

where $\overline{\mu}_t$ is the empirical mean of the process $X$, $V$ is an asymptotically strictly convex
potential and $g$ is a given function, either constant or not increasing too fast to infinity.
The authors have already proved that the ergodic behavior of $X$ is strongly related to
$g$. We go further and, using the simulated annealing method, we give some conditions
for the convergence in distribution of $X$ toward $X_\infty$ (whose law is related to the global
minima of $V$). We also investigate the case $g(t) = 1$.



1. Introduction

In [3], the authors have obtained some conditions for both the pointwise ergodicity
and the almost sure convergence of some self-interacting diffusions. We go further
in the study of such processes. The aim of this paper is to obtain some conditions, first
for the convergence in probability and second for the convergence in distribution, of the
self-interacting diffusion $X$ defined by

$$dX_t = dB_t - g(t)\nabla V(X_t - \overline{\mu}_t)\,dt, \quad X_0 = x, \tag{1.1}$$

where $B$ is a standard Brownian motion and $\overline{\mu}_t$ denotes the empirical mean of $X$:

$$\overline{\mu}_t = \frac{r\overline{\mu} + \int_0^t X_s\,ds}{r+t}. \tag{1.2}$$

Here $\mu$ is an initial probability measure on $\mathbb{R}^d$, $\overline{\mu}$ denotes the mean of $\mu$ and $r > 0$ is an
initial weight.

This paper deals with the well-known theory of simulated annealing, which has been
developed since the 1980s. For physical systems, an important question is to find the
global minimum energy states of the system. Experimentally, the ground states are
reached by a procedure called chemical annealing. Let us explain the procedure.
One first melts a substance and then cools it slowly enough to pass through the freezing
temperature. If the temperature decreases too fast, then the system does not end up
in a ground state, but in a local (but not global) minimum. On the other hand, if
the temperature decreases too slowly, then the system approaches the ground states very
slowly. The competition between these two effects determines the optimal speed of cooling,
that is, the annealing schedule.

The study of simulated annealing involves the theory of (homogeneous and) non-homogeneous
Markov chains and diffusion processes, large deviation theory, spectral analysis
of operators and singular perturbation theory. Pioneering work has been done by Freidlin
and Wentzell [6]. The initial problem consists in finding the global minima of a function
$U$. Indeed, one has to study the Markov process in $\mathbb{R}^d$ given by the Langevin-type
Markov diffusion (we emphasize that $\varepsilon = \varepsilon(t)$)

$$dX_t^\varepsilon = \varepsilon\,dB_t - \nabla U(X_t^\varepsilon)\,dt. \tag{1.3}$$

The process $X^\varepsilon$ may be considered as a perturbation of the trajectory of the dynamical system
$\dot{X}^0_t = -\nabla U(X^0_t)$. Let us explain briefly the model. If the temperature $\varepsilon$ is almost constant
for a sufficiently large amount of time, then the process $X^\varepsilon$ and the fixed temperature
process behave approximately the same at the end of that time interval. Denote by min
the set of all the global minima of $U$. The optimal annealing schedule (that is, $\varepsilon$), for the
convergence criterion $\mathbb{P}_x(X_t^\varepsilon \in \min) \to 1$ as $t$ goes to $+\infty$, was first determined by Hajek
[7] for a finite state space. Later, Chiang, Hwang and Sheu studied the convergence
rate of the latter probability via the large deviations of the transition density of $X^\varepsilon$. This
rate is actually strongly related to the spectral gap of the invariant measure of $X^\varepsilon$.

Note that Chiang, Hwang and Sheu were among the first to show the convergence of
the simulated annealing algorithm, in the case $\varepsilon(t)^2 = k/\log t$ for $k$ large enough.
Later, Royer [18] obtained the same result for $k > \Lambda$, where $\Lambda$ is related to the second
eigenvalue of the corresponding infinitesimal generator. Moreover, Hwang and Sheu
established (by probabilistic methods) the existence of $\Lambda := \lim_{\varepsilon\to 0} -\varepsilon^2\log\lambda_2^\varepsilon$. Finally,
Holley and Stroock [10] initiated another method and proved, in the discrete case, the
convergence of the simulated annealing algorithm via Sobolev's inequality. They went
further in their study with Kusuoka [8]. After that, Miclo proved, by using some
functional inequalities, that the free energy (that is, the relative entropy of the distribution
of the process at time $t$ with respect to the invariant probability at that time $t$) satisfies
a differential inequality, implying (under some decreasing evolution of the temperature to
zero) the convergence of the process to the global minima of the potential.
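As an illustration of the logarithmic cooling schedule mentioned above, the following minimal sketch simulates the Langevin diffusion (1.3) in dimension one with an Euler-Maruyama scheme. The potential, the shifted schedule $\varepsilon^2(t) = k/\log(2+t)$ (shifted only to avoid the singularity of $1/\log t$ at $t = 1$), the step size and all function names are assumptions made for this example; they are not taken from the works cited.

```python
import math
import random

def epsilon(t, k):
    """Temperature schedule with eps(t)^2 = k / log(2 + t) (the shift is an assumption)."""
    return math.sqrt(k / math.log(2.0 + t))

def annealed_langevin(grad_U, k, x0=0.0, T=50.0, dt=1e-3, seed=0):
    """Euler-Maruyama scheme for dX = eps(t) dB - U'(X) dt in dimension one."""
    rng = random.Random(seed)
    x, t = x0, 0.0
    for _ in range(int(round(T / dt))):
        noise = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        x += -grad_U(x) * dt + epsilon(t, k) * noise
        t += dt
    return x

if __name__ == "__main__":
    # Hypothetical tilted double well U(x) = x^4/4 - x^2/2 + 0.1 x.
    grad_U = lambda x: x ** 3 - x + 0.1
    print(annealed_langevin(grad_U, k=1.0))
```

Running the sketch for several seeds and several values of `k` gives a rough feeling for the freezing phenomenon described in the text, but a single trajectory proves nothing about convergence.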

A natural question arises: what happens if the temperature, that is $\varepsilon$, decreases too
fast to zero? Then the process can freeze in a local minimum (the "choice" of this
minimum depends on the initial condition) and therefore the process converges to this
local minimum. We do not consider this case here.

First, we study the process $Y_t := X_t - \overline{\mu}_t$, which satisfies the following SDE

$$dY_t = dB_t - g(t)\nabla V(Y_t)\,dt - \frac{Y_t}{r+t}\,dt, \quad Y_0 = x - \overline{\mu}.$$

We will adapt the simulated annealing method to $Y$. We will prove that, depending on $g$,
either the process $Y$ converges in distribution (and not in probability) towards a variable
which is concentrated on the global minima of $V$, or it converges in probability to a random
variable whose support is $\mathbb{R}^d$. Suppose that $V = W + \chi$ where $W$ is strictly convex
and $\chi$ is a compactly supported function. Define $\mathrm{osc}(\chi) := \sup\chi - \inf\chi$. Denote
by $(m_j)_{1\le j\le n}$, respectively $(M_j)_{1\le j\le p}$, the local minima, respectively the local maxima and saddle
points, of $V$. We have proved in [3] that there exist some nonnegative constants $a_j$ such
that $\sum_j a_j = 1$ and, for all continuous bounded $f$, $\frac{1}{t}\int_0^t f(Y_s)\,ds \to \sum_j a_j f(m_j)$. Here are
the main results of the present work, corresponding to two classes of functions $g$. First,
we prove, for a logarithmic $g$, that $X$ converges under a condition of symmetry on the
critical points of $V$:

Theorem 1.1. Suppose $\infty > \lim g(t)^{-1}\log G(t) = k > \max\{2\,\mathrm{osc}(\chi), d/4\}$. Then the
process $X$ converges in distribution to $Y_\infty + \int_0^\infty \frac{Y_s}{r+s}\,ds$ if and only if $V$ is such that
$\sum_{1\le i\le n} a_i m_i = 0$.

Second, we show that, for a constant $g$, $X$ converges under a condition of symmetry on
$V$:

Theorem 1.2. Suppose that $\lim g(t) = 1$. Then $\overline{\mu}_t$ converges in probability and $X_t$
converges in probability to $\overline{\mu}_\infty + Y_\infty$, where $Y_\infty$ has the normalized distribution density
$e^{-2V(x)}/Z$, if and only if $\int x e^{-2V(x)}\,dx = 0$. Else $X$ diverges.

The paper is organized in the following way. In Section 2, we recall the notations and 
some results of [3]. Section 3 is devoted to the study of the process Y. In particular, we 
prove the convergence in distribution of Y towards global minima thanks to the simulated 
annealing method. Afterwards, we deduce in Section 4 some conditions for the conver- 
gence of the self-interacting process X. Finally, we will study the constant case g = 1 in 
Section 5. 

2. Notation, hypotheses and former results

We denote by $G$ the function $G(t) := \int_0^t g(s)\,ds$ and $G^{-1}$ is its generalized inverse:
$G^{-1}(t) := \inf\{u \ge 0;\ G(u) \ge t\}$. In what follows, $(\cdot,\cdot)$ denotes the Euclidean scalar
product. We denote by $\mathcal{P}(\mathbb{R}^d)$ the set of probability measures on $\mathbb{R}^d$.

In the sequel, we suppose that $V : \mathbb{R}^d \to \mathbb{R}_+$ is such that:

(1) (regularity and positivity) $V \in C^2(\mathbb{R}^d)$ and $V \ge 0$;

(2) (convexity) $V = W + \chi$ where $\chi$ is a compactly supported function and there exists
$c > 0$ such that $\nabla^2 W(x) \ge c\,\mathrm{Id}$ and $\nabla\chi$ is Lipschitz (with the constant $C > 0$);

(3) (growth) there exist $a, b > 0$ such that for all $x \in \mathbb{R}^d$, we have

$$\Delta V(x) \le a + bV(x) \quad\text{and}\quad \lim_{|x|\to\infty}\frac{|\nabla V(x)|^2}{V(x)} = \infty. \tag{2.1}$$

We also assume that $V$ has a finite number of critical points. Let $\mathrm{Max} = \{M_1, M_2, \dots, M_p\}$
be the set of saddle points and local maxima of $V$ and $\mathrm{Min} = \{m_1, m_2, \dots, m_n\}$ be the
set of the local minima of $V$. We assume that for all $i$ and all nonzero $\xi \in \mathbb{R}^d$, $(\nabla^2 V(m_i)\xi, \xi) > 0$, and that for all
$M_i$, $\nabla^2 V(M_i)$ admits a negative eigenvalue.
We also assume that the map $g : \mathbb{R}_+ \to \mathbb{R}_+$ belongs to $C^1(\mathbb{R}_+)$ and, without any
loss of generality, that $g(0) > 0$. In the following, we will consider the cases $g(t) = k\log t$
and $g(t) = 1$.

Remark 2.1. If $\lim g(t) = \infty$, then for all $T > 0$, we have $G^{-1}(t+T) - G^{-1}(t) \to 0$ as $t \to \infty$.
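The generalized inverse $G^{-1}(t) = \inf\{u \ge 0;\ G(u) \ge t\}$ and the statement of Remark 2.1 can be explored numerically. The sketch below computes $G^{-1}$ by bisection for a nondecreasing $G$ with $G(0) = 0$; the function name, the tolerance and the test case $g(t) = 1 + 2t$ (so that $G(u) = u + u^2$) are assumptions chosen for the illustration. The bracket search assumes that $G$ is unbounded.

```python
def generalized_inverse(G, t, tol=1e-10):
    """Return inf{u >= 0 : G(u) >= t} for a nondecreasing, unbounded G with G(0) = 0."""
    if t <= G(0.0):
        return 0.0
    hi = 1.0
    while G(hi) < t:          # grow the bracket until G(hi) >= t
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:   # floating-point resolution reached
            break
        if G(mid) >= t:
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    G = lambda u: u + u * u   # hypothetical G, corresponding to g(t) = 1 + 2t
    for t in (1.0, 1e2, 1e4):
        gap = generalized_inverse(G, t + 1.0) - generalized_inverse(G, t)
        print(t, gap)         # the gap shrinks as t grows, as in Remark 2.1
```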

We have already shown that the SDE (1.1) admits a unique global strong
solution:

Proposition 2.2. ([3], Proposition 3.3) For any $x \in \mathbb{R}^d$, $\mu \in \mathcal{P}(\mathbb{R}^d)$ and $r > 0$, there
exists a unique global strong solution $(X_t, t \ge 0)$ of (1.1).
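To make the dynamics concrete, here is a minimal one-dimensional Euler-Maruyama sketch of the self-interacting diffusion, in which the empirical mean is updated as $\overline{\mu}_t = (r\overline{\mu} + \int_0^t X_s\,ds)/(r+t)$. The discretization, the example potential and every name in the code are assumptions made for the illustration; the scheme is only an approximation of the strong solution and no convergence claim is made for it.

```python
import math
import random

def simulate_self_interacting(grad_V, g, x0=0.0, mu_bar0=0.0, r=1.0,
                              T=10.0, dt=1e-3, seed=0):
    """Euler-Maruyama sketch for dX = dB - g(t) V'(X - mu_bar_t) dt in dimension one.

    mu_bar_t = (r * mu_bar0 + int_0^t X_s ds) / (r + t) is the empirical mean.
    Returns the final state X_T and the final empirical mean.
    """
    rng = random.Random(seed)
    x, integral, t = x0, 0.0, 0.0
    for _ in range(int(round(T / dt))):
        mu_bar = (r * mu_bar0 + integral) / (r + t)
        drift = -g(t) * grad_V(x - mu_bar)
        integral += x * dt      # left-point rule for int_0^t X_s ds
        x += drift * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    mu_bar = (r * mu_bar0 + integral) / (r + t)
    return x, mu_bar

if __name__ == "__main__":
    # Hypothetical double well V(y) = y^4/4 - y^2/2 (up to an additive constant).
    grad_V = lambda y: y ** 3 - y
    print(simulate_self_interacting(grad_V, g=lambda t: 1.0))
```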

Theorem 2.3. ([3], Theorem 5.6) Suppose that $\lim g(t) = \infty$ and $\lim g'(t)/g^2(t) = 0$. Then,
a.s. the normalized occupation measure of $Y$ converges weakly to a convex combination of
Dirac measures taken in the critical points of $V$: there exist $a_i \ge 0$ such that $\sum_{i=1}^n a_i = 1$
and for all continuous bounded functions $f$, we have a.s.

$$\frac{1}{t}\int_0^t f(Y_s)\,ds \to \sum_{i=1}^n a_i f(m_i).$$



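3. Asymptotic behavior of Y

Consider the time-changed process $Z_t := Y_{G^{-1}(t)}$, satisfying the following SDE

$$dZ_t = \frac{1}{\sqrt{g\circ G^{-1}(t)}}\,dW_t - \left(\nabla V(Z_t) + \frac{Z_t}{(r+G^{-1}(t))\,g\circ G^{-1}(t)}\right)dt, \tag{3.1}$$

where $W$ is a Brownian motion such that $\int_0^t \frac{dW_s}{\sqrt{g\circ G^{-1}(s)}}$ has the same law as $B_{G^{-1}(t)}$.¹ We
identify $1/\sqrt{g\circ G^{-1}(t)}$ as the temperature in the simulated annealing model. So, define
$\varepsilon^2(t) := \frac{1}{g\circ G^{-1}(t)}$ and $a(t) := 2(r+G^{-1}(t))\varepsilon^{-2}(t)$; the process $Z$ reads

$$dZ_t = \varepsilon(t)\,dB_t - \left(\nabla V(Z_t) + \frac{2Z_t}{a(t)}\right)dt. \tag{3.2}$$

In this part, we suppose that $\log G(t)/g(t)$ is bounded and that $g'(t)/g(t)^2$ converges to zero.

3.1. Tightness. We begin by proving that the laws of the process $Z$ form a tight family of
measures.

Lemma 3.1. (Chiang-Hwang-Sheu) There exist some $R, c > 0$ such that

$$\mathbb{E}\left(V(Z_t)\mathbf{1}_{\{V(Z_t)\ge R\}}\right) \le \frac{c}{g\circ G^{-1}(t)}.$$

¹ The Wiener processes $B_t$ and $W_t$ are not the same, but this does not matter because we are only
interested in the probability distribution.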
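Proof. Let us first exhibit a constant $\gamma$ such that $\mathbb{E}(V(Z_t)) \le \mathbb{E}V(Z_0) + \gamma t$. By Itô's
formula, we have

$$\frac{d}{dt}\mathbb{E}V(Z_t) = -\mathbb{E}\left[|\nabla V(Z_t)|^2 + \frac{2}{a(t)}(Z_t, \nabla V(Z_t)) - \frac{\varepsilon(t)^2}{2}\Delta V(Z_t)\right].$$

As $|\nabla V|^2 - \Delta V$ is bounded from below, the first assertion follows. Now, we adapt the
proof given by Duflo [5]. The growth condition (2.1) implies that there exists $r_0$ such
that, for $V(x) \ge r_0$, we get $|\nabla V(x)|^2 \ge 2cV(x)$ where $c > 0$ is a constant. Let $\phi$ be a
nonnegative and nondecreasing function of class $C^2$, $\phi : \mathbb{R} \to [0,1]$, such that $\phi(x) = 0$ for
$x \le r_0$ and $\phi(x) = 1$ for $x \ge R$, where $r_0 < R < \infty$. Remark that the continuous function
$\nabla(\phi\circ V) = (\phi'\circ V)\nabla V$ is bounded and consider the map $\psi := (\phi\circ V)V$. We apply
Itô's formula to the function $x \mapsto \psi(x)$ and we get

$$d\psi(Z_t) = \varepsilon(t)(\nabla\psi(Z_t), dW_t) + \frac{\varepsilon(t)^2}{2}\Delta\psi(Z_t)\,dt - \left(\phi\circ V(Z_t) + V(Z_t)\,\phi'\circ V(Z_t)\right)\left(|\nabla V(Z_t)|^2 + \frac{2}{a(t)}(Z_t,\nabla V(Z_t))\right)dt.$$

Let $\alpha(t) := \mathbb{E}[\phi\circ V(Z_t)V(Z_t)]$. By the first assertion, $\alpha$ is well defined. Due to the
compact support of $\phi'$ and because we can decompose $\Delta\psi$ as

$$\Delta\psi(z) = \phi''\circ V(z)V(z)|\nabla V(z)|^2 + \phi'\circ V(z)V(z)\Delta V(z) + 2\phi'\circ V(z)|\nabla V(z)|^2 + \phi\circ V(z)\Delta V(z),$$

there exists $C > 0$ such that $\Delta\psi \le C((\phi\circ V)V + 1)$. So, we get the bound

$$\int_t^{t+h}\varepsilon(s)^2\,\mathbb{E}(\Delta\psi(Z_s))\,ds \le C\left(G^{-1}(t+h) - G^{-1}(t)\right) + C\int_t^{t+h}\alpha(s)\varepsilon(s)^2\,ds.$$

On the other hand, we have the lower bound

$$\mathbb{E}\left[\left(\phi\circ V(Z_t) + V(Z_t)\phi'\circ V(Z_t)\right)|\nabla V(Z_t)|^2\right] \ge 2c\,\mathbb{E}\left[\left(\phi\circ V(Z_t) + V(Z_t)\phi'\circ V(Z_t)\right)V(Z_t)\right] \ge 2c\,\alpha(t).$$

For $r \ge r_0$ large enough, if $V(z) \ge r$ then $(z, \nabla V(z)) \ge 0$, so we get

$$\mathbb{E}\left(\left(\phi\circ V(Z_t) + V(Z_t)\phi'\circ V(Z_t)\right)(Z_t, \nabla V(Z_t))\right) \ge 0.$$

Therefore, the preceding Itô formula leads to

$$\alpha(t+h) - \alpha(t) \le -2c\int_t^{t+h}\alpha(s)\,ds + \frac{C}{2}\left(G^{-1}(t+h) - G^{-1}(t)\right) + \frac{C}{2}\int_t^{t+h}\alpha(s)\varepsilon(s)^2\,ds.$$

Letting $h$ go to zero, this yields $\alpha'(t) \le -2c\,\alpha(t) + \frac{C}{2}(\alpha(t) + 1)\varepsilon(t)^2$. Choose $r$ large
enough, so that we have

$$\alpha'(t) \le -c\,\alpha(t) + \frac{C}{2}\varepsilon(t)^2.$$

In order to solve this inequality, let $\alpha(t) := \beta(t)e^{-ct}$. We have $\beta(t) \le \beta(0) + \frac{C}{2}\int_0^t e^{cs}\varepsilon(s)^2\,ds$.
As $\lim_{t\to+\infty}\varepsilon'(t)/\varepsilon(t) = 0$, this yields $\alpha(t) \le C_1\varepsilon(t)^2/c$ where $C_1$ is a positive constant
independent of $c$. To conclude, we just need to remark that $\mathbf{1}_{\{V\ge R\}} \le \phi\circ V$. □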

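Corollary 3.2. The time marginal laws of the processes $Y$ and $Z$ are tight.

Proof. The previous result implies

$$\mathbb{E}V(Z_t) = \mathbb{E}\left(V(Z_t)\mathbf{1}_{\{V(Z_t)<R\}}\right) + \mathbb{E}\left(V(Z_t)\mathbf{1}_{\{V(Z_t)\ge R\}}\right) \le R + \frac{c}{g\circ G^{-1}(t)}.$$

So, for a constant $A > 0$, there exists a compact set $K$ such that $\{V \le A\} \subset K$ and

$$\mathbb{P}(Z_t \in K) \ge \mathbb{P}(V(Z_t) \le A) \ge 1 - \frac{\mathbb{E}V(Z_t)}{A} \xrightarrow[A\to+\infty]{} 1. \qquad □$$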

3.2. Convergence in distribution towards the global minima of V. Remember
that $\varepsilon^2(t) := \frac{1}{g\circ G^{-1}(t)}$ and $a(t) := 2(r + G^{-1}(t))\varepsilon^{-2}(t)$. The process $Z$ reads

$$dZ_t = \varepsilon(t)\,dB_t - \nabla V_t(Z_t)\,dt \tag{3.3}$$

where we have defined $V_t(x) := V(x) + \frac{|x|^2}{a(t)}$. Actually, we will prove that this non-homogeneous
Markov process converges in distribution to its "invariant" probability measure.
Of course, if we suppose that $a(t) = a$ and $\varepsilon(t) = \varepsilon$, then the convergence in
distribution is well-known.

Let $L_{t,\varepsilon}$ be the operator defined by $L_{t,\varepsilon} := \frac{\varepsilon^2}{2}\Delta - (\nabla V_t, \nabla)$ and $P^{t,\varepsilon}$ the associated
semigroup. Define

$$\Pi_{t,\varepsilon}(dx) := \frac{1}{Z(t,\varepsilon)}e^{-2\varepsilon^{-2}V_t(x)}\,dx \tag{3.4}$$

where $Z(t,\varepsilon) := \int_{\mathbb{R}^d}e^{-2\varepsilon^{-2}V_t(x)}\,dx$. As $|\nabla V_t|^2 - \Delta V_t$ is bounded from below, the theory
of Schrödinger operators implies that $L_{t,\varepsilon}$ is self-adjoint in $L^2(\Pi_{t,\varepsilon})$ and admits a spectral
gap. Furthermore, when $|\nabla V_t|^2 - \Delta V_t$ goes to infinity as $|x| \to \infty$, the spectrum
of $L_{t,\varepsilon}$ is discrete: $0 = -\lambda_1(t,\varepsilon) > -\lambda_2(t,\varepsilon) > \dots$. Heuristically, after a sufficiently long
time (on the scale set by the spectral gap), the transition density has a nice lower bound and the process is very close
to the "invariant probability" $\Pi_{t,\varepsilon}$. So, our main goal is to compute the limit of
the latter probability measure when $t$ goes to infinity. What is more, as the subspace
corresponding to the first eigenvalue $\lambda_1(t,\varepsilon)$ is composed of the constant functions, we
find

$$\lambda_2(t,\varepsilon) = \inf\left\{\int|\nabla\phi|^2\,d\Pi_{t,\varepsilon};\ \mathrm{Var}_{\Pi_{t,\varepsilon}}(\phi) = 1,\ \phi\in C^\infty(\mathbb{R}^d)\right\}.$$

So, our first aim is to compute the eigenvalue $\lambda_2$ and study its behavior when $t \to \infty$
(that is, $\varepsilon \to 0$).

Consider for a while $\Pi_{\infty,\varepsilon}(dx) := \frac{1}{Z(\varepsilon)}e^{-2\varepsilon^{-2}V(x)}\,dx$, where $Z(\varepsilon) := \int_{\mathbb{R}^d}e^{-2\varepsilon^{-2}V(x)}\,dx$. Let
$\min = \{m_1, \dots, m_q\}$ be the set of the global minima of $V$. Hwang [11] has established
that $\Pi_{\infty,\varepsilon}$ converges weakly when $\varepsilon$ converges to zero and described the limit:
Lemma 3.3. When $t$ goes to infinity, the probability measure $\Pi_{\infty,\varepsilon(t)}$ converges
weakly to

$$\Pi_0 := \Big(\sum_{1\le j\le q}(\det\nabla^2V(m_j))^{-1/2}\Big)^{-1}\sum_{1\le i\le q}(\det\nabla^2V(m_i))^{-1/2}\,\delta_{m_i}. \tag{3.5}$$

Remark 3.4. We use here a weaker form than the result of Hwang, who has actually
proved the convergence of $\Pi_{\infty,\varepsilon(t)}$ for a more general set min.
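We can now state and prove the following.

Proposition 3.5. Suppose that (asymptotically) $g(t) \ge \frac{d}{4}\log t$. The probability measure
$\Pi_{t,\varepsilon(t)}$ converges weakly to $\Pi_0$ as $t$ goes to infinity, where $\Pi_0$ is defined by (3.5).

Proof. Recall that $\varepsilon^2(t)a(t) = 2(r + G^{-1}(t))$ and

$$Z(t,\varepsilon(t)) = \int_{\mathbb{R}^d}e^{-2\varepsilon^{-2}(t)V(x)}\,e^{-\frac{|x|^2}{r+G^{-1}(t)}}\,dx.$$

Let $K$ be the compact set $K := \{x \mid V(x) \le 1\}$. There exists a constant $A > 0$ such that
$K \subset B(0,A)$. Then, on one hand, we get the upper bound

$$\int_{K^c}e^{-2\varepsilon^{-2}(t)V(x)}\,e^{-\frac{|x|^2}{r+G^{-1}(t)}}\,dx \le e^{-2\varepsilon^{-2}(t)}\int_{\mathbb{R}^d}e^{-\frac{|x|^2}{r+G^{-1}(t)}}\,dx = e^{-2\varepsilon^{-2}(t)}\left(\pi(r+G^{-1}(t))\right)^{d/2}.$$

On the other hand, we obtain similarly the lower bound

$$\int_K e^{-2\varepsilon^{-2}(t)V(x)}\,e^{-\frac{|x|^2}{r+G^{-1}(t)}}\,dx \ge e^{-\frac{A^2}{r+G^{-1}(t)}}\int_K e^{-2\varepsilon^{-2}(t)V(x)}\,dx.$$

By Laplace's method, we have (see [11], [17])

$$\int_K e^{-2\varepsilon^{-2}(t)V(x)}\,dx \underset{t\to+\infty}{\sim} \sum_{i=1}^q(\pi\varepsilon^2(t))^{d/2}(\det\nabla^2V(m_i))^{-1/2},$$

where $(m_i)_i$ are the global minima of $V$ (they form a finite set). On the other hand, as
$g(t) \ge \frac{d}{4}\log t$, we have that $G^{-1}(t)$ goes to infinity and so

$$(G^{-1}(t))^{d/2}\,\varepsilon^{-d}(t)\,e^{-2\varepsilon^{-2}(t)} = \left(G^{-1}(t)\,g\circ G^{-1}(t)\right)^{d/2}e^{-2g\circ G^{-1}(t)} \xrightarrow[t\to\infty]{} 0.$$

As a consequence, we find the asymptotic equivalence

$$Z(t,\varepsilon(t)) \sim \sum_i(\pi\varepsilon^2(t))^{d/2}(\det\nabla^2V(m_i))^{-1/2}.$$

By the same method, if $\phi$ is a continuous function with compact support containing only
the global minimum $m_i$, we find

$$\int\phi(x)\,e^{-2\varepsilon^{-2}(t)V(x)}\,e^{-\frac{|x|^2}{r+G^{-1}(t)}}\,dx \sim (\pi\varepsilon^2(t))^{d/2}(\det\nabla^2V(m_i))^{-1/2}\phi(m_i).$$

This concludes the proof and also gives the explicit form of $\Pi_0$. □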
To prove that $Z$ converges in distribution towards the global minima of $V$, we follow the
approach initiated by Holley and Stroock [10], Holley, Kusuoka and Stroock [8] and Miclo
[15], using some functional inequalities. We suppose in the following that $g\circ G^{-1}(t)$ is
asymptotically equivalent to $k\log(1+t)$ for $k$ large enough. The remainder of the section
is organized as follows. First, we will show that the measures $(\Pi_{t,\varepsilon(t)}, t \ge 0)$ satisfy a logarithmic
Sobolev inequality. This will prove useful for the convergence of the free energy to zero.
After that, we show that $Z$ converges in distribution to a random variable of law $\Pi_0$.

Definition 3.6. The measure $\mu$ satisfies the logarithmic Sobolev inequality with the constant
$C$, denoted $LSI(C)$, if for all functions $h \in L^2(\mu)$, we have

$$\int h^2\log h^2\,d\mu - \int h^2\,d\mu\,\log\int h^2\,d\mu \le C\int|\nabla h|^2\,d\mu.$$

Let $p(s,x,t,y)$ denote the density of the semigroup corresponding to the non-homogeneous
Markov process $Z$. We recall that we supposed $V = W + \chi$ and that $c$ is the convexity constant
of $W$.

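Lemma 3.7. The family of probability measures $(\Pi_{t,\varepsilon(t)}, t \ge 0)$ satisfies a logarithmic
Sobolev inequality $LSI(C(t))$, where $C(t) = 2e^{2\varepsilon^{-2}(t)\mathrm{osc}(\chi)}/c$ (and $\mathrm{osc}(\chi) = \sup\chi - \inf\chi$).

Proof. The point is to use the celebrated Bakry-Emery $\Gamma_2$-criterion (see [1]). Indeed, to
the operator $L_{t,\varepsilon(t)}$, we can associate the "carré du champ" operator: for all functions
$f, g \in C^\infty$,

$$\Gamma_1^{\varepsilon(t)}(f,g) := \frac{1}{2}\left(L_{t,\varepsilon(t)}(fg) - fL_{t,\varepsilon(t)}g - gL_{t,\varepsilon(t)}f\right). \tag{3.6}$$

Then, we define the operator $\Gamma_2^{\varepsilon(t)}$ as

$$\Gamma_2^{\varepsilon(t)}(f) := \frac{1}{2}\left(L_{t,\varepsilon(t)}\Gamma_1^{\varepsilon(t)}(f,f) - 2\Gamma_1^{\varepsilon(t)}(f, L_{t,\varepsilon(t)}f)\right). \tag{3.7}$$

The $\Gamma_2$-criterion asserts that if there exists a positive constant $C$ such that $\Gamma_2^{\varepsilon(t)} \ge C\,\Gamma_1^{\varepsilon(t)}$,
then $\Pi_{t,\varepsilon(t)}$ satisfies a logarithmic Sobolev inequality $LSI(2/C)$. An easy calculation, for
all functions $f$ of class $C^\infty$, leads to

$$\Gamma_1^{\varepsilon(t)}(f,f) = \frac{\varepsilon^2(t)}{2}|\nabla f|^2$$

and

$$\Gamma_2^{\varepsilon(t)}(f) = \frac{\varepsilon^2(t)}{2}(\nabla f, \nabla^2V\,\nabla f) + \frac{\varepsilon^4(t)}{4}\|\nabla^2f\|^2 + \frac{\varepsilon^2(t)}{a(t)}|\nabla f|^2.$$

Recall the decomposition $V = W + \chi$ where $W$ is strictly convex with a constant $c$
and $\chi$ is a compactly supported function. We apply Bakry and Emery's criterion to the
function $W + |x|^2/a(t)$ and we get that $\Gamma_2^{\varepsilon(t)}(f) \ge c\,\Gamma_1^{\varepsilon(t)}(f,f)$. So, the probability measure
$e^{-2\varepsilon^{-2}(t)(W(x)+|x|^2/a(t))}dx/Z$ satisfies the inequality $LSI(2/c)$. We conclude, by Holley and
Stroock's perturbation lemma [1], that the measure $\Pi_{t,\varepsilon(t)}$ satisfies an $LSI(C(t))$ inequality
with $C(t) \le 2e^{2\varepsilon^{-2}(t)\mathrm{osc}(\chi)}/c$. □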

SELF-INTERACTING DIFFUSIONS: CONVERGENCE IN DISTRIBUTION 9 

Let us denote by $(P_s^{t,\varepsilon}, s \ge 0)$ the $C^0$ semigroup corresponding to the non-homogeneous
Markov process $Z$. For any function $f$ and any probability measure $\mu$ (such that $f$ is
$\mu$-integrable), we write $\langle f\rangle_\mu := \int_{\mathbb{R}^d}f\,d\mu$.

Lemma 3.8. The probability measure $\Pi_{t,\varepsilon}$ admits a spectral gap: there exists a constant
$\lambda_2(t,\varepsilon) > 0$ such that for all $s \ge 0$ and all continuous $f \in L^2(\Pi_{t,\varepsilon})$,

$$\|P_s^{t,\varepsilon}f - \Pi_{t,\varepsilon}f\|_{L^2(\Pi_{t,\varepsilon})} \le e^{-\lambda_2(t,\varepsilon)s}\,\mathrm{Var}_{\Pi_{t,\varepsilon}}(f).$$

Proof. As $\Pi_{t,\varepsilon(t)}$ satisfies the inequality $LSI(C(t))$, we get the spectral gap inequality with
constant $\lambda_2(t,\varepsilon(t)) > 0$. □

We want to use the previous functional inequalities in order to prove the convergence
of $Z_t$ (and thus $Y_t$) towards the global minima of $V$.

Definition 3.9. The free energy (up to an additive constant), or relative Kullback information,
of a measure $P$ absolutely continuous with respect to $\Pi$ is $H(P|\Pi) := \int dP\,\log\frac{dP}{d\Pi}$.
Equivalently, if we suppose that $P$ (respectively $\Pi$) has a density $p$ (respectively $\pi$) with
respect to the Lebesgue measure $\lambda$, then we define

$$H(p|\Pi) := \int p\,\log\frac{p}{\pi}\,d\lambda. \tag{3.8}$$

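Proposition 3.10. For all initial $t_0, x_0$, we get, with $\dot V_t := \partial_tV_t$,

$$\frac{d}{dt}H(p(t_0,x_0,t,\cdot)|\Pi_{t,\varepsilon(t)}) \le -\frac{2}{C(t)}\varepsilon(t)^2H(p(t_0,x_0,t,\cdot)|\Pi_{t,\varepsilon(t)}) - 4\varepsilon'(t)\varepsilon(t)^{-3}\int p(t_0,x_0,t,\cdot)\left(V_t - \langle V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right)d\lambda + \frac{2}{\varepsilon(t)^2}\int p(t_0,x_0,t,\cdot)\left(\dot V_t - \langle\dot V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right)d\lambda.$$

Proof. We adapt the proof of Holley and Stroock [10] (and Miclo [15]). In order to shorten
notation, let $p_t := p(t_0,x_0,t,\cdot)$ be the density of the law of the process $Z_t$, knowing that
$Z_{t_0} = x_0$ (the existence of a density $p_t$ follows from Girsanov's theorem). Recall that the
family of probability measures $(\Pi_{t,\varepsilon(t)}, t \ge 0)$ satisfies a family of logarithmic Sobolev
inequalities $(LSI(C(t)), t \ge 0)$ and write $\Pi_{t,\varepsilon(t)}(dx) = \pi_{t,\varepsilon(t)}(x)\lambda(dx)$. Let us introduce

$$h_t := \sqrt{\frac{p_t}{\pi_{t,\varepsilon(t)}}},$$

satisfying in particular $\int h_t^2\,d\Pi_{t,\varepsilon(t)} = 1$. So, by Lemma 3.7, there exists a constant $C(t)$ such that

$$H(p_t|\Pi_{t,\varepsilon(t)}) = \int p_t\log\frac{p_t}{\pi_{t,\varepsilon(t)}}\,d\lambda \le C(t)\int|\nabla h_t|^2\,d\Pi_{t,\varepsilon(t)}. \tag{3.9}$$

Computing the gradient of $h_t$, we get

$$|\nabla h_t|^2\,\pi_{t,\varepsilon(t)} = \frac{1}{4}\left|\frac{\nabla p_t}{p_t} - \frac{\nabla\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}}\right|^2p_t.$$

We put this last estimate in the preceding inequality to get

$$H(p_t|\Pi_{t,\varepsilon(t)}) \le \frac{C(t)}{4}\int\left|\frac{\nabla p_t}{p_t} - \frac{\nabla\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}}\right|^2p_t\,d\lambda. \tag{3.10}$$

Moreover, the time-derivative of the free energy $H$ is

$$\frac{d}{dt}\int p_t\log\frac{p_t}{\pi_{t,\varepsilon(t)}}\,d\lambda = \int\partial_tp_t\,\log\frac{p_t}{\pi_{t,\varepsilon(t)}}\,d\lambda - \int p_t\,\frac{\partial_t\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}}\,d\lambda, \tag{3.11}$$

since $\int\partial_tp_t\,d\lambda = 0$. Our strategy is to find an upper bound for both terms of (3.11). The Kolmogorov forward
equation reads

$$\partial_tp_t = \frac{1}{2}\varepsilon(t)^2\Delta p_t + (\nabla p_t, \nabla V_t) + \mathrm{div}(\nabla V_t)\,p_t = \nabla\cdot\left(\frac{1}{2}\varepsilon(t)^2\nabla p_t + p_t\nabla V_t\right). \tag{3.12}$$

We have the following identities for $\pi_{t,\varepsilon(t)}$:

$$\frac{\nabla\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}} = -\frac{2}{\varepsilon(t)^2}\nabla V_t, \tag{3.13}$$

$$\frac{\partial_t\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}} = 4\frac{\varepsilon'(t)}{\varepsilon(t)^3}\left(V_t - \langle V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right) - \frac{2}{\varepsilon(t)^2}\left(\dot V_t - \langle\dot V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right). \tag{3.14}$$

Putting all the pieces together, integrating by parts and using (3.10), we get

$$\int\partial_tp_t\,\log\frac{p_t}{\pi_{t,\varepsilon(t)}}\,d\lambda = \int\log\frac{p_t}{\pi_{t,\varepsilon(t)}}\,\nabla\cdot\left(\frac{1}{2}\varepsilon(t)^2\nabla p_t + p_t\nabla V_t\right)d\lambda = -\int\left(\frac{\nabla p_t}{p_t} - \frac{\nabla\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}},\ \frac{1}{2}\varepsilon(t)^2\nabla p_t + p_t\nabla V_t\right)d\lambda$$

$$= -\frac{\varepsilon(t)^2}{2}\int\left|\frac{\nabla p_t}{p_t} + \frac{2}{\varepsilon(t)^2}\nabla V_t\right|^2p_t\,d\lambda \le -\frac{2}{C(t)}\varepsilon(t)^2H(p_t|\Pi_{t,\varepsilon(t)}).$$

On the other hand, we obtain for the second term of (3.11):

$$\int p_t\,\frac{\partial_t\pi_{t,\varepsilon(t)}}{\pi_{t,\varepsilon(t)}}\,d\lambda = 4\frac{\varepsilon'(t)}{\varepsilon(t)^3}\int p_t\left(V_t - \langle V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right)d\lambda - \frac{2}{\varepsilon(t)^2}\int p_t\left(\dot V_t - \langle\dot V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right)d\lambda.$$

Putting all the pieces together in (3.11) leads to the result.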
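□

Lemma 3.11. For all $t \ge 0$, the quantity $\langle|x|^2\rangle_{\Pi_{t,\varepsilon(t)}}$ is bounded.

Proof. Let $K$ be the compact set $K := \{x \mid V(x) \le \eta\}$ where $\eta$ is a given positive constant.
As $\Pi_{t,\varepsilon(t)}$ converges weakly to $\Pi_0$, we only need to prove that $\langle|x|^2\mathbf{1}_{K^c}\rangle_{\Pi_{t,\varepsilon(t)}}$ is bounded.
For any $t$ such that $1 > \varepsilon(t)$, we have

$$\int_{K^c}|x|^2e^{-2\varepsilon^{-2}(t)V_t(x)}\,dx \le \int_{K^c}|x|^2e^{-2V(x)}\,e^{-2V(x)(\varepsilon^{-2}(t)-1)}\,dx \le e^{-2\eta(\varepsilon^{-2}(t)-1)}\int|x|^2e^{-2V(x)}\,dx.$$

But, as proved in Proposition 3.5, we have the asymptotic equivalence

$$Z(t,\varepsilon(t)) \underset{t\to\infty}{\sim} \sum_i(\pi\varepsilon^2(t))^{d/2}(\det\nabla^2V(m_i))^{-1/2},$$

which implies, for some constant $M > 0$, $\langle|x|^2\mathbf{1}_{K^c}\rangle_{\Pi_{t,\varepsilon(t)}} \le M\varepsilon^{-d}(t)\,e^{-2\eta\varepsilon^{-2}(t)} \xrightarrow[t\to\infty]{} 0$. □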


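In order to conclude the convergence of the free energy to zero, we will use an easy
result of Miclo [15].

Lemma 3.12. (Miclo, Lemma 6) Let $f : [0,\infty[ \to \mathbb{R}_+$ be a continuous function such that
a.s.

$$f'(t) \le \alpha_t - \beta_tf(t),$$

where $\alpha$ and $\beta$ are two continuous non-negative functions such that $\int_0^\infty\beta_t\,dt = \infty$ and
$\lim_{t\to\infty}\alpha_t/\beta_t = 0$. Then $\lim_{t\to\infty}f(t) = 0$.

Proof. One has to adapt Gronwall's lemma. Let $g(t) = f(t)\exp\left(\int_0^t\beta_s\,ds\right)$, which is
continuous. We get a.s.

$$g'(t) = \left(f'(t) + \beta_tf(t)\right)\exp\left(\int_0^t\beta_s\,ds\right).$$

We therefore obtain that

$$g'(t) \le \alpha_t\exp\left(\int_0^t\beta_s\,ds\right).$$

Consequently, for all $t \ge t_0 \ge 0$, we have

$$f(t) \le f(0)e^{-\int_0^t\beta_s\,ds} + e^{-\int_0^t\beta_s\,ds}\left(\int_0^{t_0}\alpha_se^{\int_0^s\beta_u\,du}\,ds + \int_{t_0}^t\alpha_se^{\int_0^s\beta_u\,du}\,ds\right).$$

Let $\eta > 0$. We choose $t_0$ such that for all $s \ge t_0$, we have $\alpha_s \le \eta\beta_s$. We thus find an upper
bound, for any $t \ge t_0$, for the last term of the preceding inequality:

$$e^{-\int_0^t\beta_s\,ds}\int_{t_0}^t\alpha_se^{\int_0^s\beta_u\,du}\,ds \le \eta\,e^{-\int_0^t\beta_s\,ds}\int_{t_0}^t\beta_se^{\int_0^s\beta_u\,du}\,ds \le \eta.$$

As the first two terms go to zero when $t$ goes to infinity, we get $\limsup_{t\to\infty}f(t) \le \eta$. □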
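The mechanism of Lemma 3.12 can be observed on the equality case $f' = \alpha_t - \beta_tf$. The sketch below integrates this equation with an explicit Euler scheme for the hypothetical choice $\alpha_t = (1+t)^{-2}$, $\beta_t = (1+t)^{-1}$, for which $\int_0^\infty\beta_t\,dt = \infty$, $\alpha_t/\beta_t \to 0$, and the exact solution is $f(t) = (f(0) + \log(1+t))/(1+t)$. The choice of $\alpha$, $\beta$ and the function names are assumptions made for the illustration.

```python
import math

def integrate_equality_case(alpha, beta, f0, T, dt=1e-3):
    """Explicit Euler for f' = alpha(t) - beta(t) * f, the equality case of the lemma."""
    f, t = f0, 0.0
    for _ in range(int(round(T / dt))):
        f += dt * (alpha(t) - beta(t) * f)
        t += dt
    return f

def exact_solution(f0, t):
    """Closed form for alpha_t = (1+t)^-2 and beta_t = (1+t)^-1."""
    return (f0 + math.log(1.0 + t)) / (1.0 + t)

if __name__ == "__main__":
    alpha = lambda t: 1.0 / (1.0 + t) ** 2
    beta = lambda t: 1.0 / (1.0 + t)
    for T in (10.0, 50.0, 200.0):
        print(T, integrate_equality_case(alpha, beta, 1.0, T), exact_solution(1.0, T))
```

In this example $f$ decays like $\log(t)/t$, which is slower than $\alpha_t/\beta_t = 1/(1+t)$: the lemma alone gives convergence to zero, not a rate.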

►CX3 
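Corollary 3.13. If $\lim\frac{\alpha_t}{\beta_t} = 0$ and $\lim\frac{\alpha_t}{\beta_t}e^{\int_0^t\beta_s\,ds} = +\infty$, then the speed of convergence
of $f(t)$ to $0$ is asymptotically (when $t \to \infty$) $\frac{\alpha_t}{\beta_t}$.

Proof. Integrating by parts, we have

$$\int_0^t\alpha_se^{\int_0^s\beta_u\,du}\,ds = \frac{\alpha_t}{\beta_t}e^{\int_0^t\beta_u\,du} - \frac{\alpha_0}{\beta_0} - \int_0^t\left(\frac{\alpha_s}{\beta_s}\right)'e^{\int_0^s\beta_u\,du}\,ds.$$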

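Theorem 3.14. Suppose that $\varepsilon^2(t) = k/\log(t)$, with $\infty > k > 2\,\mathrm{osc}(\chi)$. Then, for all
initial $t_0, x_0$, the free energy $H(p(t_0,x_0,t,\cdot)|\Pi_{t,\varepsilon(t)})$ converges to $0$ (as $t \to \infty$).

Proof. Let $t_0 > 0$ and $x_0 \in \mathbb{R}^d$. Consider the process $Z_t$, solution to the SDE

$$dZ_t = \varepsilon(t)\,dW_t - \left(\nabla V(Z_t) + \frac{2Z_t}{a(t)}\right)dt, \quad Z_{t_0} = x_0.$$

The result of Proposition 3.10 can be rewritten in the following way:

$$\frac{d}{dt}H(p_t|\Pi_{t,\varepsilon(t)}) \le -\frac{2}{C(t)}\varepsilon(t)^2H(p_t|\Pi_{t,\varepsilon(t)}) - 4\varepsilon'(t)\varepsilon(t)^{-3}\left(\mathbb{E}V_t(Z_t) - \langle V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right) + \frac{2}{\varepsilon(t)^2}\left(\mathbb{E}\dot V_t(Z_t) - \langle\dot V_t\rangle_{\Pi_{t,\varepsilon(t)}}\right).$$

As $V(x) \ge c|x|^2$ out of a compact set and $Z$ is tight (by Corollary 3.2), we have $\mathbb{E}V_t(Z_t) =
O(1)$. Because $t \mapsto a(t)$ is nondecreasing while $t \mapsto \varepsilon(t)$ is nonincreasing, and as $\dot V_t(x) =
-\frac{a'(t)}{a(t)^2}|x|^2$, the two terms $\mathbb{E}(\dot V_t(Z_t))$ and $\langle\dot V_t\rangle_{\Pi_{t,\varepsilon(t)}}$ are nonpositive. So, it only remains
to find an upper bound for $-\langle\dot V_t\rangle_{\Pi_{t,\varepsilon(t)}}$. Indeed, by Lemma 3.11, there exist $M_1, M_2 > 0$
such that

$$\frac{d}{dt}H(p_t|\Pi_{t,\varepsilon(t)}) \le -\frac{2}{C(t)}\varepsilon(t)^2H(p_t|\Pi_{t,\varepsilon(t)}) + M_1\frac{|\varepsilon'(t)|}{\varepsilon(t)^3} + M_2\frac{a'(t)}{\varepsilon(t)^2a(t)^2}.$$

We easily compute the time-derivative of $a(t)$, using $(G^{-1})'(t) = \varepsilon^2(t)$:

$$\frac{a'(t)}{a(t)^2\varepsilon^2(t)} = -\frac{\varepsilon'(t)}{\varepsilon(t)(r+G^{-1}(t))} + \frac{\varepsilon^2(t)}{2(r+G^{-1}(t))^2} = \varepsilon^2(t)\left[\frac{1}{2kt(r+G^{-1}(t))} + \frac{\log t}{2k\,(g\circ G^{-1}(t))(r+G^{-1}(t))^2}\right].$$

As $G^{-1}(t)$ is a nondecreasing function and because of the hypothesis on $k$, the term
$\frac{t^{2\mathrm{osc}(\chi)/k}}{kt(r+G^{-1}(t))}$ converges to $0$ when $t$ goes to infinity. For the second term, we recall that
$\log G(t)/g(t)$ is bounded by assumption: there exist two positive constants $m, M$ such
that $m\,g(t) \le \log G(t) \le M\,g(t)$. So, we get $m\,g(t) \le \log(tg(t)) = \log t + \log g(t)$, which
implies that $g(t) = O(\log t)$ and then $G(t) \le tg(t) = o(t^2)$. So,

$$\frac{G(t)^{2\mathrm{osc}(\chi)/k}\log G(t)}{k\,g(t)(r+t)^2} \xrightarrow[t\to+\infty]{} 0.$$

Lemma 3.12 asserts that if $\varepsilon$ satisfies

$$\int_0^\infty\frac{\varepsilon^2(t)}{C(t)}\,dt = \infty \quad\text{and}\quad \frac{C(t)}{\varepsilon^2(t)}\left(\frac{|\varepsilon'(t)|}{\varepsilon(t)^3} + \frac{a'(t)}{\varepsilon(t)^2a(t)^2}\right) \to 0,$$

then $\lim H(p_t|\Pi_{t,\varepsilon(t)}) = 0$. We meet the required conditions with $\varepsilon^2(t) = k/\log t$. □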

Remark 3.15. The constant k is not optimal here, because we have used the perturbation 
lemma of Holley and Stroock for the estimation of the logarithmic Sobolev constant. 

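Lemma 3.16. The speed of convergence of $H(p(t_0,x_0,t,\cdot)|\Pi_{t,\varepsilon(t)})$ toward $0$ is of order $t^{-(2-2\mathrm{osc}(\chi)/k)}$ (up to logarithmic factors).

Proof. As we satisfy the hypothesis of Corollary 3.13, the speed of convergence is $\frac{\alpha_t}{\beta_t}$, with
$\alpha_t = (r+G^{-1}(t))^{-2} + (t(r+G^{-1}(t)))^{-1}$ and $\beta_t = t^{-2\mathrm{osc}(\chi)/k}/\log t$. So, we get

$$\frac{\alpha_t}{\beta_t} = \frac{t^{2\mathrm{osc}(\chi)/k}\log t}{(r+G^{-1}(t))^2} + \frac{\log t}{t^{1-2\mathrm{osc}(\chi)/k}(r+G^{-1}(t))}.$$

As, up to a multiplicative constant, $g\circ G^{-1}(t) = \log t$, we get that $g(t) = O(\log t)$ and
$G^{-1}(t)$ is of the order of $t/\log t$. So $\alpha_t/\beta_t$ is asymptotically of the order of $(\log t)^3t^{2\mathrm{osc}(\chi)/k-2}$.

□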

Remark 3.17. It is known since the work of Freidlin and Wentzell [6] (see [5], Chap. 5), 
that the Gibbs measure '^t,e{t) satisfies a large deviation principle. Therefore, the speed of 
convergence ofYit,e(t) toward Hq is e~^°^*/^'^. 

Corollary 3.18. Suppose that $\varepsilon^2(t) = k/\log(t)$, with $\infty > k > \max\{2\,\mathrm{osc}(\chi), d/4\}$. Then
$Z$ and $Y$ converge in distribution to a random variable whose law is $\Pi_0$ (3.5).

Proof. The Kullback information $H(p_t|\Pi_{t,\varepsilon(t)})$ estimates the distance between $p_t$ and $\Pi_{t,\varepsilon(t)}$
in the following way: $\|p_t - \Pi_{t,\varepsilon(t)}\|_{TV}^2 \le 2H(p_t|\Pi_{t,\varepsilon(t)})$, where $\|\cdot\|_{TV}$ denotes the total
variation norm (see [10]). The result follows because $\Pi_{t,\varepsilon(t)}$ converges weakly to $\Pi_0$ and
convergence in total variation norm implies weak convergence of measures. □
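The inequality between total variation and relative entropy used in this proof can be checked on finite probability vectors. In the sketch below the total variation norm is taken with the convention $\|p - q\|_{TV} = \sum_i|p_i - q_i|$, which is the normalization under which the constant $2$ is the classical one; this convention and the function names are assumptions of the example.

```python
import math

def kl_divergence(p, q):
    """Relative entropy H(p|q) = sum_i p_i log(p_i / q_i) for finite distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def total_variation_norm(p, q):
    """Total variation norm with the convention ||p - q||_TV = sum_i |p_i - q_i|."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.3, 0.5]
    lhs = total_variation_norm(p, q) ** 2
    rhs = 2.0 * kl_divergence(p, q)
    print(lhs, rhs, lhs <= rhs)
```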

Remark 3.19. We emphasize that if $\lim g(t)^{-1}\log G(t) = k$, with $k$ not large enough,
then the previous result is false and $Y$ does not converge toward the global minima of
$V$ (except if each local minimum of $V$ is a global minimum). Indeed, in the simulated
annealing theory, if $\varepsilon$ decreases too fast to zero, then the process $Y$ freezes in a local
minimum, the choice of the minimum depending on the initial value $Y_0 = x - \overline{\mu}$.

4. The process X: convergence in distribution

We give here sufficient conditions for the convergence of $X$. As usual, we begin by working
with the process $Y_t = X_t - \overline{\mu}_t$. In order to link this section with the preceding one, we
recall that $\varepsilon(t)^2 = (g\circ G^{-1}(t))^{-1} = k/\log t$. So, we consider only functions $g$ such that
(asymptotically) $\log G(t) = k\,g(t)$. By Theorem 2.3, there exist $a_i \ge 0$ such that $\sum_i a_i = 1$
and for all continuous bounded $f$: $\frac{1}{t}\int_0^tf(Y_s)\,ds \to \sum_i a_if(m_i)$.
Theorem 4.1. Suppose that $\infty > \lim_{t\to\infty}g(t)^{-1}\log G(t) = k > \max\{2\,\mathrm{osc}(\chi), d/4\}$. Then
one of the following holds:

(1) If $V$ is a function such that $\sum_{1\le i\le n}a_im_i = 0$, then $X_t$ converges in distribution to
$Y_\infty + \int_0^\infty\frac{Y_s}{r+s}\,ds$.

(2) Else, $X_t$ diverges.

Proof. We wonder whether $\int_0^\infty\frac{Y_s}{r+s}\,ds$ converges in distribution or not. Remark that

$$X_t = Y_t + \int_0^t\frac{ds}{r+s}Y_s = Y_t + \overline{\mu}_t.$$

Lemma 3.1 ensures that the duality for the weak convergence is true for functions bounded
by $V$, so in particular for $f(x) = x$. Suppose that $V$ is such that $\sum_{1\le i\le n}a_im_i = 0$. Then,
we know that $\frac{1}{t}\int_0^tY_s\,ds \to 0$ and it remains to find the rate of convergence in order to
conclude the proof. But Benaïm and Schreiber [2] (Theorem 1) have proved that, for an
asymptotic pseudotrajectory (in probability) $Y$, the speed of convergence of the mean of
the normalized occupation measure of $Y$ is the same as the speed of the pseudotrajectory.
This means that the speed of convergence of the normalized occupation measure of the
time-changed process $Y_{G^{-1}(t)}$ is $G^{-1}(1+t) - G^{-1}(t)$. Integrating by parts, we obtain

1 /■* 1 /''^(*) 1 /"^*^*^ g'oQ-^iu) r 



The first right-hand term converges to $0$ because $G(t) \le tg(t)$. It remains to prove the
convergence of the second term. We have that (because, up to a multiplicative constant,
$g\circ G^{-1}(u) = \log(2+u)$)



7 / du n-u Ws 'l^ \s + T)-G \s)]ds 
t Jo {9°G ^{u)y 

^ G{t){t-G-\G{t)+T) ^ 1 r^ G-\s + T)-G-\s) ^^ ^ _ 



tgit) tJo goG-^is) 

So, if $V$ is a function such that $\sum_{1\le i\le n}a_im_i \neq 0$, then $\int_0^t\frac{Y_s}{r+s}\,ds$ does not converge. Suppose
that $\sum_{1\le i\le n}a_im_i = 0$. Let $U_t := \overline{\mu}_t$ and $V_t := X_t$. We conclude, because the celebrated
Slutsky theorem asserts that for two sequences of $\mathbb{R}^d$-valued random variables $(U_t)$ and
$(V_t)$, with $U_t \to U$ in distribution and $|U_t - V_t| \to 0$ in probability, we have $V_t \to U$ in distribution. □

Remark 4.2. The condition on $g$ means that, asymptotically, $g(t) = k\log t$ for $k$ large
enough.
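5. The case g(t) = 1

We give the proof for $\mathbb{R}$ but it is easily reproduced in $\mathbb{R}^d$. We suppose now that $g(t) = 1$.
In order to study the behavior of the process $X$ solution of

$$dX_t = dB_t - V'(X_t - \overline{\mu}_t)\,dt,$$

we introduce the process $Y_t := X_t - \overline{\mu}_t$, solution of the SDE

$$dY_t = dB_t - V'(Y_t)\,dt - \frac{Y_t}{r+t}\,dt.$$

We also introduce the Kolmogorov process $Z$ solution to

$$dZ_t = dB_t - V'(Z_t)\,dt.$$

This process is a positive recurrent diffusion. Denote by $\gamma$ its invariant probability measure,
$\gamma(dx) = \frac{e^{-2V(x)}}{\int e^{-2V(y)}\,dy}\,dx$, and by $\overline{\gamma}$ the mean of $\gamma$. For all $h \in L^2(\gamma)$ we have, with an
exponential speed of convergence,

$$\lim_{t\to\infty}\frac{1}{t}\int_0^th(Z_s)\,ds = \int h\,d\gamma \quad\text{a.s.}$$

Lemma 5.1. The process $Y$ converges in probability to a random variable $Y_\infty$ of density
$\gamma$ when $t$ goes to $+\infty$.

Proof. Using that $-y(y-z) \le \frac{z^2}{4}$ for all $y, z \in \mathbb{R}$, we get

$$\frac{1}{2}\frac{d}{dt}\mathbb{E}(Y_t - Z_t)^2 = -\mathbb{E}\left(V'(Y_t) - V'(Z_t), Y_t - Z_t\right) - \mathbb{E}\left(\frac{Y_t}{r+t}, Y_t - Z_t\right) \le -(c - C)\,\mathbb{E}(Y_t - Z_t)^2 + \frac{\mathbb{E}(Z_t^2)}{4(r+t)}.$$

Applying Itô's formula, it is easy to prove the existence of $M > 0$ such that for all $t \ge 0$,
$\mathbb{E}(Z_t^2) \le M$. So, $\mathbb{E}(Y_t - Z_t)^2$ goes to zero. We choose the random variable $Z_0$ with
distribution $\gamma$, so that the law of $Z_t$ is $\gamma$ for all $t$. So, $Y$ converges in distribution to a
random variable of law $\gamma$. □

Lemma 5.2. For all $A > 0$, we have $\lim_{t\to\infty}\sup_{0\le s\le A}\left|\int_0^s(Y_{e^{t+u}} - \overline{\gamma})\,du\right| = 0$ a.s.

Proof. Fix $T > 0$. We denote by $Y^T$ the process defined by $Y^T_s := Y_{s+T}$. It is solution to

$$dY^T_s = dB^T_s - V'(Y^T_s)\,ds - \frac{Y^T_s}{r+s+T}\,ds.$$

Recall that $\overline{\gamma} = \int_{\mathbb{R}}x\,\gamma(dx)$. Let $A > 0$; we have for all $s \le A$ and $T = e^t$, with $S := e^s - 1$:

$$\int_0^s(Y_{e^{t+u}} - \overline{\gamma})\,du = \int_0^{TS}\frac{Y^T_v - Z^T_v}{v+T}\,dv + \int_0^{TS}\frac{Z^T_v - \overline{\gamma}}{v+T}\,dv =: I + J,$$

where the process $Z^T$ is the solution to the SDE

$$dZ^T_s = dB^T_s - V'(Z^T_s)\,ds, \quad Z^T_0 = Y_T.$$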
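1) Study of J. Integrating by parts, we have, with $S = e^s - 1$,

$$J = \frac{TS}{TS+T}\left(\frac{1}{TS}\int_0^{TS}Z^T_u\,du - \overline{\gamma}\right) + \int_0^{TS}\frac{v}{(v+T)^2}\left(\frac{1}{v}\int_0^vZ^T_u\,du - \overline{\gamma}\right)dv.$$

The process $Z$ satisfies the limit-quotient theorem, so the first right-hand term converges
a.s. to $0$ when $T$ goes to infinity. For the second right-hand term we have

$$\left|\int_0^{TS}\frac{v}{(v+T)^2}\left(\frac{1}{v}\int_0^vZ^T_u\,du - \overline{\gamma}\right)dv\right| \le \frac{1}{4T}\int_0^{TS}\left|\frac{1}{v}\int_0^vZ^T_u\,du - \overline{\gamma}\right|dv.$$

By Cesàro, $J$ converges to $0$ as $T$ goes to infinity.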

2) Study of I. We use the estimate of the distance between the processes $Y^T$ and $Z^T$
for $T$ large enough, as in Lemma 5.1. So, letting $\tilde{C} := 2(c - C) > 0$, we find the following
upper bound



T 

V 



u + T 



du. 



Let (Too := liKi^ /o(^J)^dM. Remind, that there exists a > such that the speed of 
convergence is less than e~"*, so that we have 



f „Cu(ryT\2 



u + T 



-du 



u + T 



du + 



V ,,Cu((ryT\2 



U + T 



-du =: K + L. 



With probability 1, we obtain the upper bounds \K\ < 



aT 



and I LI < 



aT ' 



implying 



{Y^ — Zjy < a.s. So, |/| < ^ and I converges to zero as T goes to the infinity. □ 
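We now have to study the asymptotic behavior of $\overline{\mu}_t$. Integrating by parts, we get

$$\overline{\mu}_t = \overline{\mu}_0 + \int_0^t\frac{Y_s}{r+s}\,ds = \overline{\mu}_0 + \frac{t}{r+t}m_t + \int_0^t\frac{s}{(r+s)^2}m_s\,ds, \tag{5.1}$$

where $m_t := \frac{1}{t}\int_0^tY_s\,ds$.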

The proof of the main result of this section is based on the following: 

Proposition 5.3. The process rrit converges in probability to 7. Moreover the speed of 
convergence of rrit toward 7 is less than j . 

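Proof. As $dm_t = \frac{1}{t}(Y_t - m_t)\,dt$, letting $n_t := m_{e^t}$, we get $\frac{dn_t}{dt} = Y_{e^t} - n_t$. Consequently,
we find

$$n_{t+s} - n_t = \int_t^{t+s}(Y_{e^u} - n_u)\,du = \int_0^s(-n_{t+u} + \overline{\gamma})\,du + \int_0^s(Y_{e^{t+u}} - \overline{\gamma})\,du.$$

Let $e_t(s) := \int_0^s(Y_{e^{t+u}} - \overline{\gamma})\,du$. By Lemma 5.2, for all $A > 0$ we have $\lim_{t\to\infty}\sup_{0\le s\le A}|e_t(s)| = 0$
a.s., so $n_t$ is an asymptotic pseudotrajectory (a.s.) for the flow generated by

$$\frac{d\psi_t(x)}{dt} = \overline{\gamma} - \psi_t(x), \quad \psi_0(x) = x.$$

As this flow admits only one limit point, which is exponentially attracting, the $\omega$-limit set
of $n_t$ is reduced to $\{\overline{\gamma}\}$. So, $m_t = n_{\log t}$ converges a.s. to $m_\infty := \overline{\gamma}$.

□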
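Corollary 5.4. (1) If $\overline{\gamma} = 0$, then $\overline{\mu}_t$ converges in probability as $t$ goes to infinity.
(2) If $\overline{\gamma} \neq 0$, then $\overline{\mu}_t$ diverges and $\lim\frac{\overline{\mu}_t}{\log t} = \overline{\gamma}$ (in probability).

Proof. By (5.1), and since $\frac{t}{r+t} + \int_0^t\frac{s}{(r+s)^2}\,ds = \log(1 + t/r)$, we get

$$\overline{\mu}_t = \overline{\mu}_0 + \frac{t}{r+t}(m_t - \overline{\gamma}) + \int_0^t\frac{s}{(r+s)^2}(m_s - \overline{\gamma})\,ds + \overline{\gamma}\log(1 + t/r). \qquad □$$

Summarizing, we have now proved the following.

Theorem 5.5. One of the following holds:

(1) If $\overline{\gamma} = 0$, then $\overline{\mu}_t$ converges in probability to $\overline{\mu}_\infty$ and $X_t$ converges in probability to
$Y_\infty + \overline{\mu}_\infty$, where the law of $Y_\infty$ has the density $\gamma$;

(2) Else, $X_t$ diverges.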
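The dichotomy of Theorem 5.5 rests on whether the mean $\overline{\gamma}$ of the density proportional to $e^{-2V}$ vanishes. The sketch below evaluates this mean by a Riemann sum for a symmetric double well, where it is zero, and for a tilted version, where it is not. The two potentials, the integration window and the function name are assumptions made for the illustration.

```python
import math

def gibbs_mean(V, lo=-6.0, hi=6.0, n=12001):
    """Mean of the density proportional to exp(-2 V(x)), by a Riemann sum on [lo, hi]."""
    dx = (hi - lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = lo + i * dx
        w = math.exp(-2.0 * V(x))
        num += x * w
        den += w
    return num / den

if __name__ == "__main__":
    symmetric = lambda x: x ** 4 / 4.0 - x ** 2 / 2.0             # hypothetical even potential
    tilted = lambda x: x ** 4 / 4.0 - x ** 2 / 2.0 + 0.3 * x      # hypothetical tilt
    print(gibbs_mean(symmetric))  # numerically zero: case (1)
    print(gibbs_mean(tilted))     # nonzero: case (2)
```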

6. Appendix

In Section 3, the optimal annealing schedule is not obtained. Actually, $k$ should be
directly related to $m$, as Holley, Kusuoka and Stroock [8] proved for $a(t) = \infty$. Let
$z_0 \in \mathbb{R}^d$ be such that $V_t(z_0) = 0$. Let $K$ be the compact support of $\chi$.

Definition 6.1. The maximal height of the function $V_t$ is the non-negative function $m(t)$
defined by

$$m(t) := \sup\{H_t(x, z_0) - V_t(x);\ x \in K\}, \tag{6.1}$$

where $H_t(x,z) := \inf\{E_t(\gamma);\ \gamma \in C([0,1], K),\ \gamma(0) = x,\ \gamma(1) = z\}$ and
$E_t(\gamma) := \sup\{V_t(\gamma(u));\ u \in [0,1]\}$.

Remark that $m(t)$ does not depend on $z_0$ and so, we choose $z_0 = 0$. The function $m(t)$
corresponds to the maximum of all the minimal energies one needs to go from each point
of $K$ to $z_0$. It is positive if and only if there exist several local minima.
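In dimension one a path from $x$ to $z_0$ has to cross the whole interval between them, so $H(x, z_0)$ is simply the maximum of the potential on that interval, and the maximal height becomes a running-maximum computation. The sketch below implements this on a grid; working on an interval `[lo, hi]` instead of the support $K$, the example potentials and the function name are assumptions made for the illustration.

```python
def maximal_height(V, z0, lo, hi, n=4001):
    """One-dimensional maximal height: sup over x of (max of V between x and z0) - V(x)."""
    dx = (hi - lo) / (n - 1)
    xs = [lo + i * dx for i in range(n)]
    vals = [V(x) for x in xs]
    k0 = min(range(n), key=lambda i: abs(xs[i] - z0))  # grid index closest to z0
    best = 0.0
    run = vals[k0]
    for i in range(k0, n):           # sweep to the right of z0
        run = max(run, vals[i])      # highest level met between z0 and xs[i]
        best = max(best, run - vals[i])
    run = vals[k0]
    for i in range(k0, -1, -1):      # sweep to the left of z0
        run = max(run, vals[i])
        best = max(best, run - vals[i])
    return best

if __name__ == "__main__":
    double_well = lambda x: (x * x - 1.0) ** 2   # two minima: barrier of height 1
    single_well = lambda x: x * x                # one minimum: height 0
    print(maximal_height(double_well, 1.0, -2.0, 2.0))
    print(maximal_height(single_well, 0.0, -2.0, 2.0))
```

The two printed values illustrate the remark above: the height is positive for the double well and vanishes when there is a single local minimum.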

Lemma 6.2. We have that $\lim_{t\to\infty}m(t) = m$, where $m$ is the maximal height function
corresponding to $V$.

Proof. Let $M := \sup\{|x|^2;\ x \in K\}$. For all paths $\gamma$, we have clearly $E_t(\gamma) \le E(\gamma) + \frac{M}{a(t)}$.
Then, by definition we get $|H_t(x,0) - H_\infty(x,0)| \le \frac{M}{a(t)}$. Consequently, there exists $C > 0$
such that $|m(t) - m(\infty)| \le \frac{C}{a(t)}$. □

Jacquot relates in [13] the height function to the second eigenvalue of the infinitesimal
generator of $Y^\varepsilon$ (that is, the constant involved in the spectral gap inequality). He proves
that $\lim_{\varepsilon\to0}\varepsilon^2\log\lambda_2(\infty,\varepsilon) = -2m(\infty)$. So, the "critical" value of $k$ should be $m$ instead of
$\mathrm{osc}\,\chi$.